RESEARCH OBJECTIVES


As in Year 1, our goal has been to develop a computational model that simulates the
front-end stages of human spatial vision, including the retina, retinocortical pathways, and
primary visual cortex (V1). This computational product is intended to be a functional,
working model, which processes the entire stimulus pattern by appropriate algorithms and
can depict its representation at each stage in graphic imagery.

To make this task more manageable, we made important but noncritical simplifications.
The model was confined to monocular, photopic, achromatic, quasi-stationary
vision. Motion was considered only to the extent that normal spatial processing requires
minimal eye movements. Binocularity was considered only by constraining V1 to leave
room for interleaved right- and left-eye connections.

Important parts of this complex system have been modeled in other studies. Our main
goal is to make them all fit together. In doing so, we expected to encounter problems
that have not shown up before. When this occurs, we undertake to modify available
models and possibly to reinterpret the literature on which they are based.

We achieved our goals of simulating the retinocortical projection and integrating it
with inhomogeneous retinal filtering during the first two years of the project (from 1 June
1987 to 30 May 1989). Our next goal is to incorporate into our model the spatial-vision
aspects of postsynaptic processing in the visual cortex.


II  CURRENT STATUS OF WORK


A.  KEY  PERSONNEL 

In November of 1988, Dr. Grahame Smith, who contributed a great deal to the first
year of this project, left SRI Menlo Park for Australia, on assignment to the Australian
Artificial Intelligence Institute (AAII). Since we had considerable warning of his departure,
we could compensate for its effects on the project. His role was expertly taken over by Dr.
Yvan Leclerc, also of SRI's Machine Vision Group. John Peters continued Grahame's
programming work, and Grahame stayed in touch by electronic mail and facsimile
machine, so that continuity was maintained.

Dr.  Smith’s  departure  was  the  occasion  for  a  thorough  study,  documentation,  and 
rationalization  of  the  existing  LISP  code,  some  of  which  had  been  written  by  Smith  and 
some  by  Peters.  The  consolidation  work  by  Leclerc  and  Peters  continued  into  the  early 
months  of  1989.  It  led  to  several  changes:  most  were  minor  but  some  were  not. 

B.  RETINAL FILTERING AND RETINOCORTICAL MAPPING

At the end of Year 1, we had just taken up the question of choosing a cortical mapping
function, and Grahame had investigated the log-polar projection described in Annual
Report 1. This representation has now been replaced by the Schwartz conformal mapping
function (Schwartz, 1980), discussed below. Although it involves more subtle computations,
the Schwartz function was chosen as the best approximation to the known physiology
(Tootell, Switkes, et al., 1988) that is currently available. The subroutines for
implementing it have been a main focus of the second year’s software work, as described below.

1.  Reverse-Mapping Procedure

There  is  a  basic  difficulty  with  computer  simulation  of  the  cortical  projection,  to  which 
Dr.  Smith  provided  a  simple  and  elegant  solution.  In  areas  of  high  cortical  magnification, 
near  the  fixation  point,  one  pixel  in  the  retina  maps  to  many  pixels  in  the  cortex.  It  would 
be  clumsy  and  time-consuming  to  project  this  pixel  from  the  retina  to  the  cortex  and  then  try 
to  find  the  adjacent  locations  that  should  be  filled  with  the  same  value. 

A simpler procedure is to start with the cortical image, initially filled with zeros, and
reverse-map these uniformly sampled locations in the cortex back to the retina, in order to
find the retinal location corresponding to each cortical pixel. Appropriately scaled
(receptive-field) filter functions are then constructed (in retinal pixels), and centered at each of
these cortically defined (nonuniform) sample points. Each filter function is convolved with
the visual scene (sampled in retinal pixels) and these convolutions provide appropriate
values for each cortical input pixel. This is the way in which we now create our cortical
input images.
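
The reverse-mapping loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's LISP code: the mapping is taken to be w = log(z + c) - log(c) for one half of the visual field, the Gaussian weighting is a stand-in for the report's retinal filter, and the receptive-field size law, the function names, and all parameters are hypothetical.

```python
import cmath
import math

C_DEG = 0.3  # constant c of log(z + c), in degrees


def cortex_to_retina(u, v):
    """Invert w = log(z + c) - log(c); returns retinal (x, y) in degrees."""
    z = C_DEG * (cmath.exp(complex(u, v)) - 1)
    return z.real, z.imag


def gaussian_sample(scene, x_pix, y_pix, sigma_pix):
    """Gaussian-weighted average of the scene around one retinal location."""
    radius = max(1, math.ceil(3 * sigma_pix))
    cx, cy = round(x_pix), round(y_pix)
    total = weight_sum = 0.0
    for j in range(cy - radius, cy + radius + 1):
        for i in range(cx - radius, cx + radius + 1):
            if 0 <= j < len(scene) and 0 <= i < len(scene[0]):
                d2 = (i - x_pix) ** 2 + (j - y_pix) ** 2
                wgt = math.exp(-d2 / (2 * sigma_pix ** 2))
                total += wgt * scene[j][i]
                weight_sum += wgt
    return total / weight_sum if weight_sum else 0.0


def reverse_map(scene, fix_x, fix_y, pix_per_deg, n_u, n_v, u_max):
    """Fill a one-sided cortical image by visiting every cortical pixel."""
    cortex = [[0.0] * n_u for _ in range(n_v)]  # starts as all zeros
    for row in range(n_v):
        v = (row / (n_v - 1) - 0.5) * math.pi
        for col in range(n_u):
            u = u_max * col / (n_u - 1)
            x_deg, y_deg = cortex_to_retina(u, v)
            if x_deg < 0:
                continue  # belongs to the other half-field; leave at zero
            ecc = math.hypot(x_deg, y_deg)
            # Hypothetical size law: filter width grows linearly with eccentricity.
            sigma_pix = (0.05 + 0.05 * ecc) * pix_per_deg
            cortex[row][col] = gaussian_sample(
                scene,
                fix_x + x_deg * pix_per_deg,
                fix_y - y_deg * pix_per_deg,  # image rows increase downward
                sigma_pix,
            )
    return cortex
```

Because the loop runs over cortical pixels rather than retinal ones, every cortical location receives exactly one value and no hole-filling step is needed near the fovea.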

A rough version of this reverse-mapping code was written before Grahame left, but it
has since gone through many changes, corrections and additions. We are confident that the
present cortical-magnification subroutine now correctly represents the Schwartz projection
(Schwartz and Merker, 1986). We chose the receptive-field-size variation independently of
the cortical-mapping function, because the two functions are not identical physiologically
(Rovamo and Virsu, 1979; Tootell, Switkes, et al., 1988; Van Doorn, Koenderink, et al.,
1972). The variation of these two functions with eccentricity in our model is illustrated in
Figure 1.

2.  Details  of  the  Conformal  Projection 

Our first algorithm for the Schwartz projection had unwittingly scaled the cortical
image independently in the X and Y directions, introducing serious errors in the aspect ratio
and local magnification. The interpretation of the mapping function, log (z+c), with respect
to the right and left cortical hemispheres was not addressed during Year 1. Setting the
so-called species constant, c, equal to 1.0 deg, Grahame had obtained single-hemisphere
projections of the type shown in Figure 6 of our previous report (not the Schwartz projection),
but this gave an incorrect representation of the cortical image near the fovea.

Our  present  subroutine  sets  c  =  0.3  deg,  which  makes  the  shape  of  the  projection 
approximately  that  found  in  primates  (Schwartz  and  Merker,  1986).  We  now  provide  a 
representation  of  both  hemispheres,  showing  the  visual  field  on  both  sides  of  the  fixation 
point.  Negative  locations  (in  the  right  hemispheric  representation),  corresponding  to  small, 
foveal  arguments  of  log  (z+c),  are  handled  simply  by  translating  the  entire  image  (to  the 
right),  by  the  distance  log  c.  The  subroutine  for  left-hemisphere  calculations  is  exactly  the 
mirror  image  of  the  right  one.  We  usually  juxtapose  these  two  images  at  the  fixation  point 
for  ease  of  comparison  with  the  retinal  image. 
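
As a minimal sketch of the two-hemisphere construction just described, assuming the translation is applied as w = log(z + c) - log(c) so that the fixation point lands at zero, and assuming the second half is obtained by reflecting the horizontal coordinate. The function names are hypothetical, and the result is in dimensionless log units (a scale factor to cortical millimetres is not included).

```python
import cmath
import math

C_DEG = 0.3  # c in degrees, the value used by the present subroutine


def map_one_half(x_deg, y_deg, c=C_DEG):
    """Map a point with x >= 0 (degrees from fixation) to (u, v).

    Subtracting log(c) shifts the image so that z = 0 maps to u = 0.
    """
    w = cmath.log(complex(x_deg, y_deg) + c) - math.log(c)
    return w.real, w.imag


def map_two_halves(x_deg, y_deg, c=C_DEG):
    """Mirror-image treatment of the other half, joined at the fixation point."""
    if x_deg >= 0:
        return map_one_half(x_deg, y_deg, c)
    u, v = map_one_half(-x_deg, y_deg, c)
    return -u, v
```

With this convention a point 1 deg to the side of fixation on the horizontal meridian lands at u = log(1.3/0.3), and its mirror point lands at the negative of that value.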

Figure 2 provides a graphic illustration of the results of this procedure, using the
conference-room scene from Annual Report 1. Figure 2(a) shows the retinal image with three
selected fixation points, and Figure 2(b) shows the two-hemisphere projection corresponding
to the upper-left fixation point. Both retinal and cortical filtering have been removed, to
illustrate clearly the form of retinocortical projection we are using. Individual retinal pixels
are visible in Figure 2(b) near the fovea. Their distribution is controlled by a uniform
cortical distribution, as described above.

Figure 2(b) does not occur as a stage in our model, but it can be compared with the
conformal projection images made by Schwartz, who did not take retinal or cortical filtering
into account (Schwartz and Merker, 1986). He gave no details of the construction of his
two-hemisphere images, but their appearance suggests that his right- and left-hemispheric
projections were in fact joined by the translation-reflection procedure just described. We
will occasionally include unfiltered projections such as Figure 2(b) in this report, for
purposes of comparison.

FIGURE 1  COMPARISON OF RETINAL INHOMOGENEITY WITH RETINOCORTICAL PROJECTION,
AS FUNCTIONS OF ECCENTRICITY

Axes: retinal receptive-field diameter (deg) and inverse retinocortical magnification
(deg/mm), plotted against eccentricity (deg).

Heavy line: receptive-field diameter used in our retinal-filtering algorithm,
as a function of eccentricity.

Light line: inverse of the retinocortical magnification determined from
log (z + 0.3), averaged over all meridians.

FIGURE 2  EXAMPLE OF SCHWARTZ CONFORMAL PROJECTION WITHOUT FILTERING

(a) CONFERENCE-ROOM SCENE, SHOWING THREE FIXATION POINTS USED FOR SUBSEQUENT PROCESSING

(b) UNFILTERED CORTICAL MAPPING OF SAME SCENE, WITH UPPER-LEFT FIXATION SHOWN IN (a)


C.  GABOR FILTERING OF CORTICAL IMAGES

With the availability of correctly computed cortical inputs, we can now attempt to
simulate an important spatial aspect of cortical processing sometimes known as Gabor filtering.
We believe this is the first time this has been done with retinally filtered, conformally
mapped cortical images.

Two-dimensional Gabor functions, which are one-directional sine waves with
two-dimensional Gaussian envelopes, have previously been used to model cortical receptive
fields (Daugman, 1985), as have other, Gabor-like functions (such as Gaussian derivatives
or differences of Gaussians). Available data are probably insufficient to choose any of
these alternatives on the basis of goodness of fit. We chose Gabor functions here in the
same way that we chose our retinal filter functions; the chosen functions are well known
and simple to compute.

1.  Details  of  the  Gabor  Computations 

This was the first part of our simulation where we could assume at least approximate
homogeneity, so we took advantage of the convolution theorem. Instead of convolving a
given Gabor filter with the cortical input, we took the (two-dimensional) Fourier transform
of both, multiplied these frequency functions together, and inverse-transformed the product
to obtain the desired result. Figure 3(a) shows the frequency spectrum of a vertically
oriented Gabor function, and Figure 3(b) shows its inverse transform. These are
even-symmetric functions. Odd-symmetric Gabor functions would also be required for image
coding (Daugman, 1987), but such Hilbert pairs will not be treated further in this report.
Although much simpler to implement, this convolution algorithm involved a few
unexpected problems. To center the spectrum of the Gabor filter, it was necessary to know
the coordinates of the origin of the fast Fourier transform (FFT) subroutine in ImagCalc™.
As this FFT had been used mainly to obtain power spectra (amplitudes but no phase
information), its origin was undocumented; however, it turned out to be in the lower left corner.

Another peculiarity of the FFT involved scaling. If the number of pixels across the
image is less than 2^n (where n is an integer), the FFT fills the remainder with zeros
before transforming it. In the two-dimensional case, as long as the image is square, the
square spectrum does not result in scaling problems. But if the image is not square (as
ours are not) ImagCalc™ still keeps its spectrum square; thus its horizontal and vertical
scale factors may differ by a power of 2. Since the inverse transform undoes this
distortion, it has no effect in some applications. However, since our Gabor functions were
created as Gaussian blobs in the frequency domain, they had to be distorted beforehand in
order to inverse-transform correctly. We used Gabor functions with the standard shape of
Daugman's “polar wavelet” family (Daugman, 1988), with frequency bandwidth twice the
orientation bandwidth (30 deg).

FIGURE 3  STANDARD FORM OF EVEN GABOR FILTER

(a) FREQUENCY SPECTRUM OF VERTICAL GABOR FUNCTION USED IN CORTICAL SIMULATION

(b) INVERSE, TWO-DIMENSIONAL FOURIER TRANSFORM OF (a)
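
The scaling issue can be illustrated numerically. The sketch below assumes one plausible reading of the description above, namely that each axis is zero-padded to its own power of two, so that one frequency bin covers a different range of cycles per pixel horizontally and vertically; how ImagCalc actually stores its spectrum is not documented here. The image size is the 743- by 248-pixel cortical image quoted in the Appendix; the function names and the blob parameters are hypothetical.

```python
import numpy as np


def next_pow2(n):
    """Smallest power of two that is >= n."""
    return 1 << (n - 1).bit_length()


def padded_shape(shape):
    """Per-axis power-of-two padding of an image of the given (rows, cols)."""
    return tuple(next_pow2(n) for n in shape)


def blob_on_padded_grid(shape, f0_x, sigma_f):
    """Isotropic Gaussian blob at (f0_x, 0) cycles/pixel on the padded grid."""
    rows, cols = padded_shape(shape)
    fy = np.fft.fftfreq(rows)[:, None]
    fx = np.fft.fftfreq(cols)[None, :]
    return np.exp(-((fx - f0_x) ** 2 + fy ** 2) / (2 * sigma_f ** 2))


rows, cols = padded_shape((248, 743))  # 248 rows by 743 columns
bin_ratio = cols // rows               # horizontal bins per vertical bin

# A blob that is circular in cycles/pixel is elongated when counted in bins.
blob = blob_on_padded_grid((248, 743), f0_x=0.125, sigma_f=0.05)
width_x = int((blob[0, :] >= 0.5).sum())    # half-maximum width, horizontal bins
width_y = int((blob[:, 128] >= 0.5).sum())  # column 128 is fx = 0.125
```

Specifying the blob in cycles per pixel, as here, applies the required pre-distortion automatically; a blob drawn as a circle in bin units would instead come out stretched along one axis after the inverse transform.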

2.  Interaction  Between  Retinal  and  Cortical  Filtering 

The frequencies and orientations of the first Gabor filters we used were chosen to
avoid a fundamental theoretical problem: postsynaptic cortical potentials that show
Gabor-like receptive fields are not measured relative to any retinally filtered cortical input, but
rather with respect to the distal stimulus pattern, which is the definition of a receptive field
(Daugman, 1985). Thus, cortical receptive fields necessarily represent the effects of
consecutive retinal, LGN, and cortical filtering. Therefore, even neglecting LGN
effects (as our model does), it would be necessary in principle to deconvolve the
(Laplacian/Gaussian) retinal filtering from the Gabor-like end result, in order to see the
“true” cortical filter.

This retinocortical interaction is not very apparent, however, partly because the
bandwidths of the retinal filters are considerably greater than those of the cortical filters (Kelly
and Burbeck, 1984). Thus, if we carefully center our Gabor filters within the passband of
the (remapped) retinal filters, their interaction should be minimal. (In the limit, i.e., for an
all-pass retina, deconvolution would have no effect.) We located the retinal passband of our
model experimentally, by transforming cortical input images to the frequency domain, and
chose our Gabor filters accordingly. The resulting Gabor functions are the ones used for
the Gabor-filtering results reported here.

The efficacy of this procedure could be tested by substituting an all-pass (identity)
filter for our retinal filters and then determining whether this changes the Gabor-filtered
cortical output significantly. We plan to make such tests during the next project period.

D.  GRAPHIC  RESULTS 

In  this  section  we  provide  representative  illustrations  of  the  information  contained  at 
various  stages  of  our  retinocortical  model.  Because  images  like  these  have  not  been  created 
before,  it  seemed  worthwhile  to  provide  several  examples  for  each  stage.  If  it  is  true  that 
“a  picture  is  worth  10,000  words,”  they  should  contribute  greatly  to  the  brevity  of  this 
report. 

1.  Retinal-Filtered  Cortical  Inputs 

Although our retinal-filtering subroutines were substantially complete in the first year
of this project, they had not been married to the cortical-projection technique described
above. Thus the construction of retinal-filtered, cortical “input images,” correct in detail to
the best of our knowledge, was an important achievement of Year 2, which we illustrate
here with examples of these images, and illustrations of how they are affected by changes
of fixation.

Figure 4 shows the retinal-filtered, cortical input corresponding to Figure 2(a) and
(b). Although there is no dc (zero-frequency component) in this image (large areas are all
the same shade of gray) and edges are rather coarsely represented, surprisingly little
information about the scene is destroyed by retinal filtering. The same is true of all the other
cortical inputs illustrated in this section.

Figures 4 and 6-16 also illustrate that retinocortical magnification in our model
compensates for retinal inhomogeneity reasonably well, except in the vicinity of the fovea,
where it overcompensates (more magnification). This extra magnification is shown
quantitatively in Figure 5, obtained by taking the ratio of the two curves plotted in Figure 1. The
cortical distance calculated in this way is fairly constant for eccentricities greater than a few
degrees, but increases rapidly near the fovea. This seems to be a realistic representation of
the physiological data we used for the model.

Figure 6 shows the projected, filtered and unfiltered images corresponding to the
lower fixation point in Figure 2(a). Comparing this fixation with the one shown in Figures
2(b) and 4, we can see that objects far from the fixation point, although relatively
undistorted, are rotated when the fixation point is elevated or depressed. (Note the person on the
right side of the visual field.)

Figure 7 shows the projected, filtered image corresponding to the upper-right fixation
point in Figure 2(a). Here the person on the left has been rotated almost 90 degrees,
relative to his position in Figure 6. When a human subject raises or lowers his direction of
gaze, he does not perceive any such rotation of peripheral objects, of course. But this
graphically illustrates one of the problems involved in assembling information from
multiple fixations after they have been processed by V1. Does Gabor filtering, or some other
known physiological mechanism, facilitate the assembly process? That is the chief
remaining question to be pursued in this project.

The peripheral-rotation effect could be quantified by aiming our model at a rectilinear
grid or checkerboard target. But the main entrance of SRI, shown in Figure 8, serves
equally well. Here the fixation point is near the center of a two-story facade, with vertical
columns and rectangular windows. Lines near the horizontal meridian remain horizontal in
the cortical image, all the way to the edges of the 40-deg visual field shown here. But
vertical lines project to parabolic arcs, steepest near the vertical meridian but not straight
anywhere.
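
For the idealized mapping w = log(z + c), the size of this rotation follows directly from the derivative dw/dz = 1/(z + c): a small feature at retinal position z is turned through the angle arg(dw/dz) = -arg(z + c). The sketch below evaluates that angle; it assumes the pure log(z + c) form for one half-field (x >= 0), ignores filtering and pixel sampling, and uses a hypothetical function name. The sign of the result depends on the orientation convention chosen for the cortical axes.

```python
import math

C_DEG = 0.3  # c in degrees


def local_rotation_deg(x_deg, y_deg, c=C_DEG):
    """Rotation (degrees) applied to a small feature at (x, y), with x >= 0."""
    return -math.degrees(math.atan2(y_deg, x_deg + c))
```

On the horizontal meridian (y = 0) the rotation is zero, which agrees with horizontal lines staying horizontal there, while for points well above or below fixation and close to the vertical meridian the magnitude approaches 90 degrees.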

We conclude this section with similar treatments of two portraits, familiar if not
standard in the image-coding and computer vision community. Figure 9 shows the “Lena”
picture, with three different fixation points marked as in Figure 2(a). Figures 10, 11, and
12 illustrate cortical inputs for the center, upper-right, and lower-left fixations, respectively. The
features just described (extra magnification near the fovea and object rotation in the
periphery) are readily apparent. Finally, Figures 13-16 show similar treatments of the
full-face mandrill picture, again with three fixation points. These four scenes constitute our
entire library, but with our ability to select arbitrary fixation points, they seem to be more
than sufficient for our plans in the foreseeable future.

WITH UPPER-LEFT FIXATION AS IN FIGURE 2(b)

FIGURE 5  RATIO OF THE TWO CURVES IN FIGURE 1

Axes: calculated cortical distance (mm) plotted against eccentricity (deg).

If cortical magnification exactly compensated for retinal inhomogeneity,
this ratio would be constant.

FIGURE 6  CONFERENCE-ROOM SCENE, WITH LOWER FIXATION POINT SHOWN IN FIGURE 2(a)

FIGURE 8  BUILDING FACADE SCENE, WITH FIXATION BETWEEN COLUMNS

FIGURE 9  LENA PORTRAIT, SHOWING THREE FIXATION POINTS USED FOR SUBSEQUENT PROCESSING

FIGURE 10  LENA PORTRAIT, WITH CENTER FIXATION POINT SHOWN IN FIGURE 9

FIGURE 11  LENA PORTRAIT, WITH UPPER-RIGHT FIXATION POINT SHOWN IN FIGURE 9

FIGURE 12  LENA PORTRAIT, WITH LOWER-LEFT FIXATION POINT SHOWN IN FIGURE 9

2.  Gabor-Filtered  Cortical  Outputs 

In image-coding studies, the purpose of Gabor filtering is simply to reproduce the
original image by way of a code that uses many fewer bits/pixel, and this can be done quite
successfully (Daugman, 1988). In our case, however, the application is quite different and
presents its own unique problems. Gabor filtering was incorporated in our model because
something similar is known to be performed in the visual cortex, V1 (Daugman, 1980;
Jones and Palmer, 1987). Its relation to retinal inhomogeneity and cortical magnification is
considered here for the first time, as far as we know. Moreover, in order to try to
understand how the cortical outputs from various fixations are assembled, it seems essential to
include some form of Gabor filtering.

Up to this point we have been able to display the results of each stage of visual
processing graphically, as a single, two-dimensional array of signal strengths, or an image.
This was possible only because of the simplifying assumptions introduced at the start of
this project. At this point, however, we seem to need another such assumption (or
assumptions).

The output of a general, Gabor-filtering process is not one image but several
(Daugman, 1988), each corresponding to a Gabor function of a particular frequency, symmetry,
and orientation, like the components of a cortical hypercolumn. How shall we choose one
of these images? Or if they are to be combined in some way, how shall we combine them?
As usual, we started with the simplest useful operation we could perform.

Our Gabor-filtered results, like the other illustrations, are shown on a black
background for clarity. However, we inserted the background only after the final transform
step (see Subsection C.1). During the FFT computations, we used a gray background
equal to the average value of the image, to eliminate any dc component. Figure 17(a)
shows the result of filtering the cortical input of Figure 4 with the vertical Gabor function
shown in Figure 3(b). In Figure 17(b), the same image has been filtered with a horizontal
Gabor function of the same frequency. In Figure 18, the same two filters have been applied
to the image of Figure 6(b), which is another fixation of the same scene.

Orientation selectivity is the most important property of Gabor filters, but extracting a
single orientation in this way does not convey much information about the scene, even with
a Gabor function tuned to the retinal prefiltering (see Subsection C.2). The most prominent
horizontal and vertical features in any one fixation seem unrelated. More significant for our
purposes, the most prominent features in one fixation seem unrelated to those in the other,
for either orientation. Similar results for vertical and horizontal filtering at various fixations
were obtained with the two portraits, but these are not shown here.

One important reason for this lack of correlation is the peripheral-rotation effect discussed
in Subsection D.1 and most clearly illustrated in Figure 8. Figure 19 shows the same
vertical and horizontal Gabor filters applied to Figure 8(b). Note that the columns of the
facade, although vertical in the original scene, shift from the vertical-feature image to the
horizontal one with increasing eccentricity. The effect is particularly evident in the left
hemisphere for this fixation.

FIGURE 14  MANDRILL FACE, WITH CENTER FIXATION POINT SHOWN IN FIGURE 13

FIGURE 16  MANDRILL FACE, WITH LEFT-CENTER FIXATION POINT SHOWN IN FIGURE 13

FIGURE 18  LINEAR FILTERING OF CORTICAL INPUT SHOWN IN FIGURE 6(b) (conference room, lower fixation point)

These results are not really surprising because this type of homogeneous, linear
operation is not a very good model for the known physiology. There is no reason to expect any
specialized cooperation among just those cells from every orientation column that happen to
have the same orientation in cortical coordinates. A less arbitrary hypothesis might involve
cooperation among oriented units that follow the external coordinate system suggested by
Figure 8. That is still too constraining, however, since it implies a world made of
horizontal and vertical lines. What we need is a Gabor filtering process that responds to the
input orientations in an adaptive, nonlinear way.

Taking a leaf from the image-coding studies (Daugman, 1987), however, we might
hope that such a process could be constructed from linear, homogeneous building blocks,
like those just described. This will be our first approach in the coming year.

E.  SUMMARY  AND  FUTURE  PLANS 

In Year 1, we were mainly concerned with inhomogeneous filtering in the retina; in
Year 2, we worked with retinocortical projection and its integration with retinal
inhomogeneity. Having successfully completed the software for simulating and displaying the
results of these two operations, we are now prepared to start the third (and most difficult)
phase of this project: implementation of the postsynaptic cortical mechanisms of spatial
vision (perhaps, post-V1). We will start, as usual, by building on the results achieved thus
far, following some of the suggestions given above.

FIGURE 19  LINEAR FILTERING OF CORTICAL INPUT SHOWN IN FIGURE 8(b) (building facade)

Appendix

CODE  STATISTICS 


All code was written in Symbolics Common LISP and run on a Symbolics 3600-series LISP
machine. This code also made extensive use of the ImagCalc™* vision system.


Cortical transformation code:

    Total lines of code                                853 lines
    Run time for unfiltered cortical projection        2 minutes, 7 seconds
    Run time for filtered cortical projection          5 hours, 8 minutes
    Average retinal-filter convolution run time        0.0996 second


Gabor filtering code:

    Total lines of code (not including ImagCalc FFT)   182 lines
    Run time for Gabor convolution                     15 minutes, 50 seconds

Run  times  for  the  cortical  transformation  were  measured  while  running  with  a  40-degree  field  on  a 
650-  by  496-pixel  retinal  input  image,  producing  a  743-  by  248-pixel  cortical  output  image.  The 
Gabor  convolution  run  time  was  measured  while  processing  a  743-  by  248-pixel  cortical  image. 


*ImagCalc™ is a trademark of SRI International.